51 research outputs found

    The Repository as Data (Re) User: Hand Curating for Replication

    On August 31, 2010, Yale University’s Institution for Social and Policy Studies launched the ISPS Data Archive (http://isps.yale.edu/research/data). The motivation for the Archive was to capture and preserve intellectual output produced by scholars affiliated with ISPS, to share data and associated research output, and to link to publications and projects. The Archive was developed as a pilot for the university (under the Office for Digital Assets and Infrastructure). It provides a model for customized platforms that meet the needs of one research unit, and otherwise relies entirely on Yale IT and library resources (no third party vendors or tools). The ISPS Data Archive currently holds over 1,000 files for ~55 studies. ISPS has created policies for users and depositors in consultation with Yale’s General Counsel and the IRB, and in line with best practices among leading social science data archives elsewhere (e.g., ICPSR). Files are hand-curated: This means that all files are inspected by RAs for PII data and for useable labeling and all program files are run to validate results. (For more background information, see: http://www.ijdc.net/index.php/ijdc/article/view/212.)

    A Repository on a Mission: A Small Research Community Gets Serious about Reproducibility

    Objective: To describe the process and challenges of creating a replication data archive at the Institution for Social and Policy Studies (ISPS) at Yale University. The Archive provides open access to research data, links data to publications, and ultimately facilitates reproducibility. Description: The ISPS Data Archive is a digital repository for research produced by scholars affiliated with ISPS, with special focus on experimental design and methods. The primary goal of the Archive is to be used for replicating research results, i.e. by using author-provided code and data. The Archive was launched in September 2010 as a pilot for Yale’s Office of Digital Assets and Infrastructure (ODAI) to find solutions relating to storage, persistent linking, long-term preservation, and integration with a developing institutional repository. Results: Before data publication, Archive staff processes data and code files, including verifying replication, adding metadata, and converting to CSV and R. To date, the ISPS Data Archive has published over 750 files for about 45 studies. Conclusions: The development and implementation of the ISPS Data Archive, though outside the library, raises issues familiar to librarians: the need for clear policies from the institution; the challenge of finding support for the provision of high quality services; the complexity of working in close partnership with IT; the need to keep up with fast-paced changes in technology and in user expectations; and the challenge of bringing about change in community norms and practices. Alongside these practical issues, fundamental questions arise about the appropriate role of the university vs. the disciplines when it comes to data archiving, especially in light of the need to comply with requirements from funders and journals. Related publication: http://www.ijdc.net/index.php/ijdc/article/view/21

    A matter of integrity: can improved curation efforts prevent the next data sharing disaster?

    Wider openness and access to data may be a necessary first step for scientific and social innovation, but as the controversial release of OK Cupid data highlights, open data efforts must also consider the quality and reproducibility of this data. What would it take for data curation to routinely consider quality and reproducibility as standard practice? Limor Peer suggests some future directions to ensure data quality, consistency, and integrity.

    The Local TV News Experience: How to Win Viewers by Focusing on Engagement

    Offers television stations insights to help them engage their audiences, stimulate strategic thinking about their position and role in the market, and connect with viewers in ways that could lead to improved civic involvement

    YARD: A Tool for Curating Research Outputs

    Repositories increasingly accept research outputs and associated artifacts that underlie reported findings, leading to potential changes in the demand for data curation and repository services. This paper describes a curation tool that responds to this challenge by economizing and optimizing curation efforts. The curation tool is implemented at Yale University’s Institution for Social and Policy Studies (ISPS) as YARD. By standardizing the curation workflow, YARD helps create high quality data packages that are findable, accessible, interoperable, and reusable (FAIR) and promotes research transparency by connecting the activities of researchers, curators, and publishers through a single pipeline

    If it catches my eye: an exploration of online news experiences of teenagers

    Teenagers aren't much into following serious news online, but news organizations can, and should, cultivate their interest by learning how to catch their eyes, diminish their angst, go where they are on the Web, enlist parents and teachers in the cause and help them develop a news persona, according to this report. The report is based on a qualitative, in-depth study of 65 Chicago-area teens conducted in 2007 by Media Management Center. The purpose was to identify what drives online news consumption of teenagers. Researchers found that while serious news, particularly news of politics, government and public affairs, is not currently that important to most teens, they are "interestable." They will look at news online if it catches their eye, with content that interests them, video, the right topics, humorous and weird news, and new things. The report urged news organizations to make "catching the eye" of teenagers the core of a bold new strategy for attracting teens online; to "work over time to fan whatever sparks of interest they may have in news into a more robust flame of interest in various types of news," and to "make a special effort to encourage, and even increase the number of, teens who consider it part of their identity to follow and talk about the news."

    Saving Software and Using Emulation to Reproduce Computationally Dependent Research Results

    Using digital data necessarily involves software. How do institutions think about software in the context of the long-term usability of their data assets? How do they address usability challenges uniquely posed by software, such as license restrictions, legacy software, code rot, and dependencies? These questions are germane to the agenda set forth by the FAIR principles. At Yale University, a team in the Library is looking into the application of a novel approach to emulation as a potential solution. In this presentation, we will outline the work of the Emulation as a Service Infrastructure (EaaSI) program, discuss our plans for integrating the tooling of the EaaSI program into archiving and scientific workflows, and report on early work exploring how emulation relates to the computational reproduction of results from data archived at Yale's Institution for Social and Policy Studies (ISPS) and other research data repositories.

    Building an Open Data Repository for a Specialized Research Community: Process, Challenges and Lessons

    In 2009, the Institution for Social and Policy Studies (ISPS) at Yale University began building an open access digital collection of social science experimental data, metadata, and associated files produced by ISPS researchers. The digital repository was created to support the replication of research findings and to enable further data analysis and instruction. Content is submitted to a rigorous process of quality assessment and normalization, including transformation of statistical code into R, an open source statistical software. Other requirements included: (a) that the repository be integrated with the current database of publications and projects publicly available on the ISPS website; (b) that it offered open access to datasets, documentation, and statistical software program files; (c) that it utilized persistent linking services and redundant storage provided within the Yale Digital Commons infrastructure; and (d) that it operated in accordance with the prevailing standards of the digital preservation community. In partnership with Yale’s Office of Digital Assets and Infrastructure (ODAI), the ISPS Data Archive was launched in the fall of 2010. We describe the process of creating the repository, discuss prospects for similar projects in the future, and explain how this specialized repository fits into the larger digital landscape at Yale

    Preparing to Share Social Science Data: An Open Source, DDI-based Curation System

    Objective: This poster will describe the development of a curatorial system to support a repository for research data from randomized controlled trials in the social sciences. Description: The Institution for Social and Policy Studies (ISPS) at Yale University and Innovations for Poverty Action (IPA) are partnering with Colectica to develop a software platform that structures the curation workflow, including checking data for confidentiality and completeness, creating preservation formats, and reviewing and verifying code. The software leverages DDI Lifecycle – the standard for data documentation – and will enable a seamless framework for collecting, processing, archiving, and publishing data. This data curation software system combines several off-the-shelf components with a new, open source, Web application that integrates the existing components to create a flexible data pipeline. The software will help automate parts of the data pipeline and will unify the workflow for staff, and potentially for researchers. Default components include Fedora Commons, Colectica Repository, and Drupal, but the software is developed so each of these can be swapped for alternatives. Results: The software is designed to integrate into any repository workflow, and can also be incorporated earlier in the research workflow, ensuring eventual data and code deposits are of the highest quality. Conclusions: This poster will describe the requirements for the new curatorial workflow tool, the components of the system, how tasks are launched and tracked, and the benefits of building an integrated curatorial system for data, documentation, and code

    Committing to Data Quality Review

    Amid the pressure and enthusiasm for researchers to share data, a rapidly growing number of tools and services have emerged. What do we know about the quality of these data? Why does quality matter? And who should be responsible for data quality? We believe an essential measure of data quality is the ability to engage in informed reuse, which requires that data are independently understandable. In practice, this means that data must undergo quality review, a process whereby data and associated files are assessed and required actions are taken to ensure files are independently understandable for informed reuse. This paper explains what we mean by data quality review, what measures can be applied to it, and how it is practiced in three domain-specific archives. We explore a selection of other data repositories in the research data ecosystem, as well as the roles of researchers, academic libraries, and scholarly journals in regard to their application of data quality measures in practice. We end with thoughts about the need to commit to data quality and who might be able to take on those tasks